In this article, we review the literature on statistical theories of neural networks from three perspectives: approximation, training dynamics, and generative models. In the first part, results on excess risks for neural networks are reviewed in the nonparametric framework of regression. These results rely on explicit constructions of neural networks, leading to fast convergence rates of excess risks. Nonetheless, the underlying analysis applies only to the global minimizer in the highly nonconvex landscape of deep neural networks. This motivates us to review the training dynamics of neural networks in the second part. Specifically, we review articles that attempt to answer the question of how a neural network trained via gradient-based methods finds a solution that generalizes well to unseen data. In particular, two well-known paradigms are reviewed: the neural tangent kernel and mean-field paradigms. Last, we review the most recent theoretical advances in generative models, including generative adversarial networks, diffusion models, and in-context learning in large language models, from the first two of these perspectives: approximation and training dynamics.
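As a concrete illustration of the neural tangent kernel paradigm mentioned in the abstract, the following is a minimal, hypothetical Python/JAX sketch (not taken from the article) that evaluates the empirical NTK, Theta(x, x') = <grad_theta f(x; theta), grad_theta f(x'; theta)>, for a small two-layer network. The architecture, width, and 1/sqrt(fan-in) initialization scale are illustrative assumptions.

import jax
import jax.numpy as jnp

def mlp(params, x):
    # Illustrative two-layer network: f(x; theta) = w2 @ relu(w1 @ x).
    w1, w2 = params
    return w2 @ jax.nn.relu(w1 @ x)

def empirical_ntk(params, x1, x2):
    # Empirical NTK at the current parameters:
    # Theta(x1, x2) = <grad_theta f(x1; theta), grad_theta f(x2; theta)>.
    g1 = jax.grad(lambda p: mlp(p, x1)[0])(params)
    g2 = jax.grad(lambda p: mlp(p, x2)[0])(params)
    return sum(jnp.vdot(a, b)
               for a, b in zip(jax.tree_util.tree_leaves(g1),
                               jax.tree_util.tree_leaves(g2)))

key1, key2 = jax.random.split(jax.random.PRNGKey(0))
d, m = 3, 16  # input dimension and width, chosen arbitrarily for the sketch
params = (jax.random.normal(key1, (m, d)) / jnp.sqrt(d),
          jax.random.normal(key2, (1, m)) / jnp.sqrt(m))
print(empirical_ntk(params, jnp.array([1.0, 0.5, -0.5]),
                    jnp.array([0.2, -1.0, 0.3])))

In the NTK regime, as the width m grows this kernel stays nearly constant during training, so gradient descent on the squared loss behaves like kernel regression with Theta; this is the sense in which the paradigm links training dynamics to generalization.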
-
We propose a combined model for network data that integrates a latent factor model and a sparse graphical model. Neither a latent factor model nor a sparse graphical model alone may be sufficient to capture the structure of the data. The proposed model has a latent (i.e., factor analysis) component to represent the main trends (the factors) and a sparse graphical component that captures the remaining ad hoc dependence. Model selection and parameter estimation are carried out simultaneously via a penalized likelihood approach. The convexity of the objective function allows us to develop an efficient algorithm, while the penalty terms push towards low-dimensional latent components and a sparse graphical structure. The effectiveness of our model is demonstrated via simulation studies, and the model is also applied to four real datasets: Zachary's karate club data, Krebs's U.S. political book dataset (http://www.orgnet.com), a U.S. political blog dataset, and a citation network of statisticians, showing meaningful performance in practical situations.
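For concreteness, a convex penalized-likelihood objective of this sparse-plus-low-rank type can be written as follows. This is a hedged sketch, since the abstract does not state the exact formula; the split of the precision matrix into S - L (S sparse, L low-rank) is an assumption borrowed from the standard latent-variable Gaussian graphical model formulation:

\min_{S,\, L}\; -\log\det(S - L) \;+\; \operatorname{tr}\!\big(\hat{\Sigma}(S - L)\big) \;+\; \lambda \lVert S \rVert_1 \;+\; \gamma\, \operatorname{tr}(L), \qquad \text{subject to } S - L \succ 0,\; L \succeq 0,

where \hat{\Sigma} is the sample covariance, the \ell_1 penalty on S induces a sparse graph, and the trace penalty on L (a convex surrogate for rank) pushes towards a low-dimensional latent component, matching the abstract's description of a convex objective whose penalties promote sparsity and low dimensionality.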